A New Value Iteration Method for the Average Cost Dynamic Programming Problem

Author

  • DIMITRI P. BERTSEKAS
Abstract

We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. This method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. Contrary to the standard relative value iteration, our method involves a weighted sup-norm contraction, and for this reason it admits a Gauss–Seidel implementation. Computational tests indicate that the Gauss–Seidel version of the new method substantially outperforms the standard method for difficult problems.
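For orientation, the standard relative value iteration that the abstract uses as its point of comparison can be sketched as follows. This is only the well-known baseline, not the new method proposed in the paper, whose details are not given here. The two-state, two-action problem (`P`, `g`), the reference state `ref`, and the function name are hypothetical and chosen purely for illustration.

```python
def relative_value_iteration(P, g, ref=0, tol=1e-12, max_iter=10_000):
    """Standard relative value iteration for an average cost MDP (baseline sketch).

    P[u][i][j]: probability of moving from state i to state j under action u.
    g[i][u]:    one-stage cost of applying action u in state i.
    ref:        reference state whose backed-up value is subtracted each sweep.
    Returns (lam, h): estimated optimal average cost and differential costs.
    """
    n = len(g)
    h = [0.0] * n
    lam = 0.0
    for _ in range(max_iter):
        # Bellman backup (T h)(i) = min_u [ g(i,u) + sum_j p_ij(u) h(j) ]
        Th = [
            min(
                g[i][u] + sum(P[u][i][j] * h[j] for j in range(n))
                for u in range(len(g[i]))
            )
            for i in range(n)
        ]
        lam = Th[ref]                    # current estimate of the average cost
        h_new = [v - lam for v in Th]    # subtract so that h_new[ref] == 0
        converged = max(abs(a - b) for a, b in zip(h_new, h)) < tol
        h = h_new
        if converged:
            break
    return lam, h


# Hypothetical example: every stationary policy is unichain and aperiodic.
P = [
    [[0.5, 0.5], [0.5, 0.5]],  # action 0
    [[0.9, 0.1], [0.2, 0.8]],  # action 1
]
g = [
    [1.0, 1.5],  # costs in state 0 for actions 0 and 1
    [3.0, 2.0],  # costs in state 1 for actions 0 and 1
]

lam, h = relative_value_iteration(P, g)
print(lam, h)
```

In this toy problem the returned pair satisfies the average cost Bellman equation, with `lam` approaching 5/3. Note that each sweep here updates all states from the previous iterate simultaneously; the abstract's point is that this standard scheme, unlike the proposed one, lacks the contraction property that would justify a Gauss–Seidel (in-place) variant.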


Similar resources

Fully Fuzzy Transportation Problem

The transportation problem is a linear programming problem that considers the minimum cost of shipping a product from some origins to other destinations, such as from factories to warehouses, or from a warehouse to supermarkets. To solve this problem the simplex algorithm is utilized. In real projects, costs and the values of supply and demand are fuzzy numbers, and it is expected that optimal solutions for determini...


Dynamic Multi Period Production Planning Problem with Semi Markovian Variable Cost (TECHNICAL NOTE)

This paper develops a method for solving the single product multi-period production-planning problem, in which the production and the inventory costs of each period are concave and backlogging is not permitted. It is also assumed that the unit variable cost of the production evolves according to a continuous time Markov process. We prove that this production-planning problem can be stated as a ...


A New Approach to Distribution Fitting: Decision on Beliefs

We introduce a new approach to distribution fitting, called Decision on Beliefs (DOB). The objective is to identify the probability distribution function (PDF) of a random variable X with the greatest possible confidence. It is known that $f_X$ is a member of $S = \{f_1, \ldots, f_m\}$. To reach this goal and select $f_X$ from this set, we utilize stochastic dynamic programming and formulate this proble...


General Dynamic Programming Algorithms Applied to Polling

We formulate the problem of scheduling a single server in a multi-class queueing system as a Markov decision process under the discounted cost and the average cost criteria. We develop a new implementation of the modified policy iteration (MPI) dynamic programming algorithm to efficiently solve problems with large state spaces and small action spaces. This implementation has an enhanced policy ev...


A New Multi-Objective Model for Dynamic Cell Formation Problem with Fuzzy Parameters

This paper proposes a comprehensive, multi-objective, mixed-integer, nonlinear programming (MINLP) model for a cell formation problem (CFP) under fuzzy and dynamic conditions aiming at: (1) minimizing the total cost which consists of the costs of intercellular movements and subcontracting parts as well as the cost of purchasing, operation, maintenance and reconfiguration of machines, (2) maximi...




Publication date: 1995